22 research outputs found

    Estrategias de paralización para la optimización de métodos computacionales en el descubrimiento de nuevos fármacos.

    Get PDF
    El descubrimiento de fármacos es un proceso largo y costoso que involucra varias etapas; entre ellas destaca la identificación de candidatos a fármacos; es decir moléculas potencialmente activas para neutralizar una determinada proteína involucrada en una enfermedad. Esta etapa se fundamenta en la optimización del acoplamiento molecular entre un receptor y un ingente número de candidatos a fármacos, para determinar cuál de estos candidatos obtiene una mayor intensidad en el acoplamiento. El acoplamiento molecular entre dos compuestos bioactivos está sujeto a una serie de fenómenos físicos presentes en la naturaleza y que se modelan a través de una función de scoring. Estos modelos representan los comportamientos de las moléculas en la naturaleza, permitiendo trasladar esta interacción molecular a una simulación en plataformas computacionales de silicio. Esta tesis doctoral plantea la aceleración y mejora de los métodos de descubrimiento de nuevos fármacos mediante técnicas de inteligencia artificial y paralelismo. Se propone un esquema metaheurístico parametrizado y paralelo que determine la interacción molecular entre compuestos bioactivos. Las técnicas metaheurísticas son técnicas algorítmicas empleadas, generalmente, en la optimización de cualquier tipo de problema, proporcionando soluciones satisfactorias. Algunos ejemplos de metaheurísticas incluyen búsquedas locales; que centran su campo de actuación a su entorno de soluciones (vecinos) más cercanos; búsquedas basadas en poblaciones muy utilizadas en la simulación de procesos biológicos y entre los que destacan los algoritmos evolutivos o las búsquedas dispersas por mencionar algunos ejemplos. Los esquemas parametrizados de metaheurísticas definen una serie de funciones básicas (Inicializar, Fin, Seleccionar, Combinar, Mejorar e Incluir) a fin de parametrizar el tipo de metaheurística concreta a instanciar en cada ejecución de la aplicación, permitiendo así no sólo la optimización del problema a resolver, sino también del algoritmo empleado para su resolución. Trabajar con una combinación de parámetros u otra es un factor vital para encontrar una buena solución al problema. Para abordar este número elevado de parámetros necesitamos alguna estrategia para este nuevo problema de optimización que surge. Esta estrategia es la hiperheurística, que busca la mejor de entre un conjunto de metaheurísticas aplicadas a un mismo problema. La gran mayoría de algoritmos metaheurísticos son, por definición, masivamente paralelos, y por tanto su implementación en plataformas secuenciales compromete tanto la eficiencia como la eficacia de los mismos. En ésta tesis doctoral se adapta además la instanciación del esquema metaheurístico a plataformas masivamente paralelas y heterogéneas como procesadores de memoria compartida y tarjetas gráficas. Las técnicas masivamente paralelas en GPU con soporte CUDA ayudan a realizar estos cálculos poniendo a disposición de la aplicación miles de núcleos capaces de funcionar en paralelo y, además, con la posibilidad de compartir memoria entre ellos y así reducir aún más los accesos a memoria. Aun así, existen compuestos celulares de decenas de miles de átomos para los que el uso de una sola GPU puede ser insuficiente, convirtiéndola en un cuello de botella. Esto hace necesario extender el esquema a multiGPU para dividir la carga computacional y poder abordar este tipo de compuestos con suficientes garantías de rendimiento. Para mejorar el rendimiento y maximizar la paralelización de la aplicación, es fundamental aprovechar al máximo los recursos que nos ofrece la máquina, por ello, se realiza un trabajo previo para ajustar los parámetros de la opción paralela elegida al entorno de ejecución y trabajar con los parámetros que mejor se adapten a la máquina. En un nodo, podemos tener un número limitado de GPUs, y para simular una molécula podemos obtener buenos rendimientos, pero en el problema de descubrimiento de fármacos, podemos tener millones de candidatos a fármacos con los que simular. En este caso, escalamos a un clúster de cómputo. Uno de los enfoques tomados por la comunidad para aprovechar todos los recursos de un clúster de computadores, de manera transparente al usuario, ha sido la virtualización del sistema. Entornos como (VMWARE, XEN) virtualizan todo el sistema y no solo una parte, siendo muy inadecuado en entornos de computación de alto rendimiento, ya que las restricciones a que deben someterse al ser un entorno compartido, introducen una sobrecarga que no es posible asumir. En lugar de virtualizar todo el sistema, sería virtualizar solo un conjunto de recursos específicos, como las GPUs. Este trabajo lo realiza un middleware muy potente denominado rCUDA. Este software permite el uso simultáneo y remoto de GPUs con soporte CUDA. Para habilitar la aceleración remota de GPUs, este software del sistema crea dispositivos virtuales compatibles con CUDA en máquinas sin GPUs locales. Además, rCUDA aporta una reducción de la complejidad algorítmica, evitando utilizar técnicas basadas en paso de mensajes (MPI), muy utilizadas en este tipo de entornos de cómputo. Las técnicas algorítmicas que se van a desarrollar, van a posibilitar la elección de las diferentes plataformas de cómputo disponibles optimizando el entorno de ejecución y, balanceando la carga de trabajo con los parámetros de configuración más idóneos.Ingeniería, Industria y Construcció

    Evaluation of Clustering Algorithms on HPC Platforms

    Full text link
    [EN] Clustering algorithms are one of the most widely used kernels to generate knowledge from large datasets. These algorithms group a set of data elements (i.e., images, points, patterns, etc.) into clusters to identify patterns or common features of a sample. However, these algorithms are very computationally expensive as they often involve the computation of expensive fitness functions that must be evaluated for all points in the dataset. This computational cost is even higher for fuzzy methods, where each data point may belong to more than one cluster. In this paper, we evaluate different parallelisation strategies on different heterogeneous platforms for fuzzy clustering algorithms typically used in the state-of-the-art such as the Fuzzy C-means (FCM), the Gustafson-Kessel FCM (GK-FCM) and the Fuzzy Minimals (FM). The experimental evaluation includes performance and energy trade-offs. Our results show that depending on the computational pattern of each algorithm, their mathematical foundation and the amount of data to be processed, each algorithm performs better on a different platform.This work has been partially supported by the Spanish Ministry of Science and Innovation, under the Ramon y Cajal Program (Grant No. RYC2018-025580-I) and by the Spanish "Agencia Estatal de Investigacion" under grant PID2020-112827GB-I00 /AEI/ 10.13039/501100011033, and under grants RTI2018-096384-B-I00, RTC-2017-6389-5 and RTC2019-007159-5, by the Fundacion Seneca del Centro de Coordinacion de la Investigacion de la Region de Murcia under Project 20813/PI/18, and by the "Conselleria de Educacion, Investigacion, Cultura y Deporte, Direccio General de Ciencia i Investigacio, Proyectos AICO/2020", Spain, under Grant AICO/2020/302.Cebrian, JM.; Imbernón, B.; Soto, J.; Cecilia-Canales, JM. (2021). Evaluation of Clustering Algorithms on HPC Platforms. Mathematics. 9(17):1-20. https://doi.org/10.3390/math917215612091

    METADOCK: A parallel metaheuristic schema for virtual screening methods

    Get PDF
    Virtual screening through molecular docking can be translated into an optimization problem, which can be tackled with metaheuristic methods. The interaction between two chemical compounds (typically a protein, enzyme or receptor, and a small molecule, or ligand) is calculated by using highly computationally demanding scoring functions that are computed at several binding spots located throughout the protein surface. This paper introduces METADOCK, a novel molecular docking methodology based on parameterized and parallel metaheuristics and designed to leverage heterogeneous computers based on heterogeneous architectures. The application decides the optimization technique at running time by setting a configuration schema. Our proposed solution finds a good workload balance via dynamic assignment of jobs to heterogeneous resources which perform independent metaheuristic executions when computing different molecular interactions required by the scoring functions in use. A cooperative scheduling of jobs optimizes the quality of the solution and the overall performance of the simulation, so opening a new path for further developments of virtual screening methods on high-performance contemporary heterogeneous platforms.Ingeniería, Industria y Construcció

    Enhancing large-scale docking simulation on heterogeneous systems: An MPI vs rCUDA study

    Full text link
    [EN] Virtual Screening (VS) methods can considerably aid clinical research by predicting how ligands interact with pharmacological targets, thus accelerating the slow and critical process of finding new drugs. VS methods screen large databases of chemical compounds to find a candidate that interacts with a given target. The computational requirements of VS models, along with the size of the databases, containing up to millions of biological macromolecular structures, means computer clusters are a must. However, programming current clusters of computers is no easy task, as they have become heterogeneous and distributed systems where various programming models need to be used together to fully leverage their resources. This paper evaluates several strategies to provide peak performance to a GPU-based molecular docking application called METADOCK in heterogeneous clusters of computers based on CPU and NVIDIA Graphics Processing Units (GPUs). Our developments start with an OpenMP, MPI and CUDA METADOCK version as a baseline case of cluster utilization. Next, we explore the virtualized GPUs provided by the rCUDA framework in order to facilitate the programming process. rCUDA allows us to use remote GPUs, i.e. installed in other nodes of the cluster, as if they were installed in the local node, so enabling access to them using only OpenMP and CUDA. Finally, several load balancing strategies are analyzed in a search to enhance performance. Our results reveal that the use of middleware like rCUDA is a convincing alternative to leveraging heterogeneous clusters, as it offers even better performance than traditional approaches and also makes it easier to program these emerging clusters.This work is jointly supported by the Fundacion Seneca (Agencia Regional de Ciencia y Tecnologia, Region de Murcia) under grant 18946/JLI/13, and by the Spanish MEC and European Commission FEDER under grants TIN2015-66972-C5-3-R and TIN2016-78799-P (AEI/FEDER, UE). We also thank NVIDIA for hardware donation under GPU Educational Center 2014-2016 and Research Center 2015-2016. Furthermore, researchers from Universitat Politecnica de Valencia are supported by the Generalitat Valenciana under Grant PROMETEO/2017/077. Authors are also grateful for the generous support provided by Mellanox Technologies Inc.Imbernón, B.; Prades Gasulla, J.; Gimenez Canovas, D.; Cecilia, JM.; Silla Jiménez, F. (2018). Enhancing large-scale docking simulation on heterogeneous systems: An MPI vs rCUDA study. Future Generation Computer Systems. 79:26-37. https://doi.org/10.1016/j.future.2017.08.050S26377

    High-throughput fuzzy clustering on heterogeneous architectures

    Full text link
    [EN] The Internet of Things (IoT) is pushing the next economic revolution in which the main players are data and immediacy. IoT is increasingly producing large amounts of data that are now classified as "dark data'' because most are created but never analyzed. The efficient analysis of this data deluge is becoming mandatory in order to transform it into meaningful information. Among the techniques available for this purpose, clustering techniques, which classify different patterns into groups, have proven to be very useful for obtaining knowledge from the data. However, clustering algorithms are computationally hard, especially when it comes to large data sets and, therefore, they require the most powerful computing platforms on the market. In this paper, we investigate coarse and fine grain parallelization strategies in Intel and Nvidia architectures of fuzzy minimals (FM) algorithm; a fuzzy clustering technique that has shown very good results in the literature. We provide an in-depth performance analysis of the FM's main bottlenecks, reporting a speed-up factor of up to 40x compared to the sequential counterpart version.This work was partially supported by the Fundacion Seneca del Centro de Coordinacion de la Investigacion de la Region de Murcia under Project 20813/PI/18, and by Spanish Ministry of Science, Innovation and Universities under grants TIN2016-78799-P (AEI/FEDER, UE), RTI2018-096384-B-I00, RTI2018-098156-B-C53 and RTC-2017-6389-5.Cebrian, JM.; Imbernón, B.; Soto, J.; García, JM.; Cecilia-Canales, JM. (2020). High-throughput fuzzy clustering on heterogeneous architectures. Future Generation Computer Systems. 106:401-411. https://doi.org/10.1016/j.future.2020.01.022S401411106Waldrop, M. M. (2016). The chips are down for Moore’s law. Nature, 530(7589), 144-147. doi:10.1038/530144aCecilia, J. M., Timon, I., Soto, J., Santa, J., Pereniguez, F., & Munoz, A. (2018). High-Throughput Infrastructure for Advanced ITS Services: A Case Study on Air Pollution Monitoring. IEEE Transactions on Intelligent Transportation Systems, 19(7), 2246-2257. doi:10.1109/tits.2018.2816741Singh, D., & Reddy, C. K. (2014). A survey on platforms for big data analytics. Journal of Big Data, 2(1). doi:10.1186/s40537-014-0008-6Stephens, N., Biles, S., Boettcher, M., Eapen, J., Eyole, M., Gabrielli, G., … Walker, P. (2017). The ARM Scalable Vector Extension. IEEE Micro, 37(2), 26-39. doi:10.1109/mm.2017.35Wright, S. A. (2019). Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems. Future Generation Computer Systems, 92, 900-902. doi:10.1016/j.future.2018.11.020Jain, A. K., Murty, M. N., & Flynn, P. J. (1999). Data clustering. ACM Computing Surveys, 31(3), 264-323. doi:10.1145/331499.331504Lee, J., Hong, B., Jung, S., & Chang, V. (2018). Clustering learning model of CCTV image pattern for producing road hazard meteorological information. Future Generation Computer Systems, 86, 1338-1350. doi:10.1016/j.future.2018.03.022Pérez-Garrido, A., Girón-Rodríguez, F., Bueno-Crespo, A., Soto, J., Pérez-Sánchez, H., & Helguera, A. M. (2017). Fuzzy clustering as rational partition method for QSAR. Chemometrics and Intelligent Laboratory Systems, 166, 1-6. doi:10.1016/j.chemolab.2017.04.006H.S. Nagesh, S. Goil, A. Choudhary, A scalable parallel subspace clustering algorithm for massive data sets, in: Proceedings 2000 International Conference on Parallel Processing, 2000, pp. 477–484.Bezdek, J. C., Ehrlich, R., & Full, W. (1984). FCM: The fuzzy c-means clustering algorithm. Computers & Geosciences, 10(2-3), 191-203. doi:10.1016/0098-3004(84)90020-7Havens, T. C., Bezdek, J. C., Leckie, C., Hall, L. O., & Palaniswami, M. (2012). Fuzzy c-Means Algorithms for Very Large Data. IEEE Transactions on Fuzzy Systems, 20(6), 1130-1146. doi:10.1109/tfuzz.2012.2201485Flores-Sintas, A., Cadenas, J., & Martin, F. (1998). A local geometrical properties application to fuzzy clustering. Fuzzy Sets and Systems, 100(1-3), 245-256. doi:10.1016/s0165-0114(97)00038-9Soto, J., Flores-Sintas, A., & Palarea-Albaladejo, J. (2008). Improving probabilities in a fuzzy clustering partition. Fuzzy Sets and Systems, 159(4), 406-421. doi:10.1016/j.fss.2007.08.016Timón, I., Soto, J., Pérez-Sánchez, H., & Cecilia, J. M. (2016). Parallel implementation of fuzzy minimals clustering algorithm. Expert Systems with Applications, 48, 35-41. doi:10.1016/j.eswa.2015.11.011Flores-Sintas, A., M. Cadenas, J., & Martin, F. (2001). Detecting homogeneous groups in clustering using the Euclidean distance. Fuzzy Sets and Systems, 120(2), 213-225. doi:10.1016/s0165-0114(99)00110-4Wang, H., Potluri, S., Luo, M., Singh, A. K., Sur, S., & Panda, D. K. (2011). MVAPICH2-GPU: optimized GPU to GPU communication for InfiniBand clusters. Computer Science - Research and Development, 26(3-4), 257-266. doi:10.1007/s00450-011-0171-3Kaltofen, E., & Villard, G. (2005). On the complexity of computing determinants. computational complexity, 13(3-4), 91-130. doi:10.1007/s00037-004-0185-3Johnson, S. C. (1967). Hierarchical clustering schemes. Psychometrika, 32(3), 241-254. doi:10.1007/bf02289588Saxena, A., Prasad, M., Gupta, A., Bharill, N., Patel, O. P., Tiwari, A., … Lin, C.-T. (2017). A review of clustering techniques and developments. Neurocomputing, 267, 664-681. doi:10.1016/j.neucom.2017.06.053Woodley, A., Tang, L.-X., Geva, S., Nayak, R., & Chappell, T. (2019). Parallel K-Tree: A multicore, multinode solution to extreme clustering. Future Generation Computer Systems, 99, 333-345. doi:10.1016/j.future.2018.09.038Kwedlo, W., & Czochanski, P. J. (2019). A Hybrid MPI/OpenMP Parallelization of KK -Means Algorithms Accelerated Using the Triangle Inequality. IEEE Access, 7, 42280-42297. doi:10.1109/access.2019.2907885Li, Y., Zhao, K., Chu, X., & Liu, J. (2013). Speeding up k-Means algorithm by GPUs. Journal of Computer and System Sciences, 79(2), 216-229. doi:10.1016/j.jcss.2012.05.004Saveetha, V., & Sophia, S. (2018). Optimal Tabu K-Means Clustering Using Massively Parallel Architecture. Journal of Circuits, Systems and Computers, 27(13), 1850199. doi:10.1142/s0218126618501992Djenouri, Y., Djenouri, D., Belhadi, A., & Cano, A. (2019). Exploiting GPU and cluster parallelism in single scan frequent itemset mining. Information Sciences, 496, 363-377. doi:10.1016/j.ins.2018.07.020Krawczyk, B. (2016). GPU-Accelerated Extreme Learning Machines for Imbalanced Data Streams with Concept Drift. Procedia Computer Science, 80, 1692-1701. doi:10.1016/j.procs.2016.05.509Fang, Y., Chen, Q., & Xiong, N. (2019). A multi-factor monitoring fault tolerance model based on a GPU cluster for big data processing. Information Sciences, 496, 300-316. doi:10.1016/j.ins.2018.04.053Tanweer, S., & Rao, N. (2019). Novel Algorithm of CPU-GPU hybrid system for health care data classification. Journal of Drug Delivery and Therapeutics, 9(1-s), 355-357. doi:10.22270/jddt.v9i1-s.244

    Evaluation of Clustering Algorithms on GPU-Based Edge Computing Platforms

    Get PDF
    [EN] Internet of Things (IoT) is becoming a new socioeconomic revolution in which data and immediacy are the main ingredients. IoT generates large datasets on a daily basis but it is currently considered as "dark data", i.e., data generated but never analyzed. The efficient analysis of this data is mandatory to create intelligent applications for the next generation of IoT applications that benefits society. Artificial Intelligence (AI) techniques are very well suited to identifying hidden patterns and correlations in this data deluge. In particular, clustering algorithms are of the utmost importance for performing exploratory data analysis to identify a set (a.k.a., cluster) of similar objects. Clustering algorithms are computationally heavy workloads and require to be executed on high-performance computing clusters, especially to deal with large datasets. This execution on HPC infrastructures is an energy hungry procedure with additional issues, such as high-latency communications or privacy. Edge computing is a paradigm to enable light-weight computations at the edge of the network that has been proposed recently to solve these issues. In this paper, we provide an in-depth analysis of emergent edge computing architectures that include low-power Graphics Processing Units (GPUs) to speed-up these workloads. Our analysis includes performance and power consumption figures of the latest Nvidia's AGX Xavier to compare the energy-performance ratio of these low-cost platforms with a high-performance cloud-based counterpart version. Three different clustering algorithms (i.e., k-means, Fuzzy Minimals (FM), and Fuzzy C-Means (FCM)) are designed to be optimally executed on edge and cloud platforms, showing a speed-up factor of up to 11x for the GPU code compared to sequential counterpart versions in the edge platforms and energy savings of up to 150% between the edge computing and HPC platforms.This work has been partially supported by the Spanish Ministry of Science and Innovation, under the Ramon y Cajal Program (Grant No. RYC2018-025580-I) and under grants RTI2018-096384-B-I00, RTC-2017-6389-5 and RTC2019-007159-5 and by the Fundacion Seneca del Centro de Coordinacion de la Investigacion de la Region de Murcia under Project 20813/PI/18.Cecilia-Canales, JM.; Cano, J.; Morales-García, J.; Llanes, A.; Imbernón, B. (2020). Evaluation of Clustering Algorithms on GPU-Based Edge Computing Platforms. Sensors. 20(21):1-19. https://doi.org/10.3390/s20216335S1192021Gebauer, H., Fleisch, E., Lamprecht, C., & Wortmann, F. (2020). Growth paths for overcoming the digitalization paradox. Business Horizons, 63(3), 313-323. doi:10.1016/j.bushor.2020.01.005Guillén, M. A., Llanes, A., Imbernón, B., Martínez-España, R., Bueno-Crespo, A., Cano, J.-C., & Cecilia, J. M. (2020). Performance evaluation of edge-computing platforms for the prediction of low temperatures in agriculture using deep learning. The Journal of Supercomputing, 77(1), 818-840. doi:10.1007/s11227-020-03288-wWang, J., Ma, Y., Zhang, L., Gao, R. X., & Wu, D. (2018). Deep learning for smart manufacturing: Methods and applications. Journal of Manufacturing Systems, 48, 144-156. doi:10.1016/j.jmsy.2018.01.003Gretzel, U., Sigala, M., Xiang, Z., & Koo, C. (2015). Smart tourism: foundations and developments. Electronic Markets, 25(3), 179-188. doi:10.1007/s12525-015-0196-8Pramanik, M. I., Lau, R. Y. K., Demirkan, H., & Azad, M. A. K. (2017). Smart health: Big data enabled health paradigm within smart cities. Expert Systems with Applications, 87, 370-383. doi:10.1016/j.eswa.2017.06.027Weber, M., & Podnar Žarko, I. (2019). A Regulatory View on Smart City Services. Sensors, 19(2), 415. doi:10.3390/s19020415Ghosh, A., Chakraborty, D., & Law, A. (2018). Artificial intelligence in Internet of things. CAAI Transactions on Intelligence Technology, 3(4), 208-218. doi:10.1049/trit.2018.1008Monti, L., Vincenzi, M., Mirri, S., Pau, G., & Salomoni, P. (2020). RaveGuard: A Noise Monitoring Platform Using Low-End Microphones and Machine Learning. Sensors, 20(19), 5583. doi:10.3390/s20195583Kumar, P., Sinha, K., Nere, N. K., Shin, Y., Ho, R., Mlinar, L. B., & Sheikh, A. Y. (2020). A machine learning framework for computationally expensive transient models. Scientific Reports, 10(1). doi:10.1038/s41598-020-67546-wMittal, S., & Vetter, J. S. (2015). A Survey of CPU-GPU Heterogeneous Computing Techniques. ACM Computing Surveys, 47(4), 1-35. doi:10.1145/2788396Singh, D., & Reddy, C. K. (2014). A survey on platforms for big data analytics. Journal of Big Data, 2(1). doi:10.1186/s40537-014-0008-6Khayyat, M., Elgendy, I. A., Muthanna, A., Alshahrani, A. S., Alharbi, S., & Koucheryavy, A. (2020). Advanced Deep Learning-Based Computational Offloading for Multilevel Vehicular Edge-Cloud Computing Networks. IEEE Access, 8, 137052-137062. doi:10.1109/access.2020.3011705Satyanarayanan, M. (2017). The Emergence of Edge Computing. Computer, 50(1), 30-39. doi:10.1109/mc.2017.9Capra, M., Peloso, R., Masera, G., Roch, M. R., & Martina, M. (2019). Edge Computing: A Survey On the Hardware Requirements in the Internet of Things World. Future Internet, 11(4), 100. doi:10.3390/fi11040100Lu, H., Gu, C., Luo, F., Ding, W., & Liu, X. (2020). Optimization of lightweight task offloading strategy for mobile edge computing based on deep reinforcement learning. Future Generation Computer Systems, 102, 847-861. doi:10.1016/j.future.2019.07.019Mimmack, G. M., Mason, S. J., & Galpin, J. S. (2001). Choice of Distance Matrices in Cluster Analysis: Defining Regions. Journal of Climate, 14(12), 2790-2797. doi:10.1175/1520-0442(2001)0142.0.co;2Gimenez, C. (2006). Logistics integration processes in the food industry. International Journal of Physical Distribution & Logistics Management, 36(3), 231-249. doi:10.1108/09600030610661813Chang, P.-C., Liu, C.-H., & Fan, C.-Y. (2009). Data clustering and fuzzy neural network for sales forecasting: A case study in printed circuit board industry. Knowledge-Based Systems, 22(5), 344-355. doi:10.1016/j.knosys.2009.02.005Zheng, B., Yoon, S. W., & Lam, S. S. (2014). Breast cancer diagnosis based on feature extraction using a hybrid of K-means and support vector machine algorithms. Expert Systems with Applications, 41(4), 1476-1482. doi:10.1016/j.eswa.2013.08.044Woodley, A., Tang, L.-X., Geva, S., Nayak, R., & Chappell, T. (2019). Parallel K-Tree: A multicore, multinode solution to extreme clustering. Future Generation Computer Systems, 99, 333-345. doi:10.1016/j.future.2018.09.038Kwedlo, W., & Czochanski, P. J. (2019). A Hybrid MPI/OpenMP Parallelization of KK -Means Algorithms Accelerated Using the Triangle Inequality. IEEE Access, 7, 42280-42297. doi:10.1109/access.2019.2907885Liu, B., He, S., He, D., Zhang, Y., & Guizani, M. (2019). A Spark-Based Parallel Fuzzy cc -Means Segmentation Algorithm for Agricultural Image Big Data. IEEE Access, 7, 42169-42180. doi:10.1109/access.2019.2907573Baydoun, M., Ghaziri, H., & Al-Husseini, M. (2018). CPU and GPU parallelized kernel K-means. The Journal of Supercomputing, 74(8), 3975-3998. doi:10.1007/s11227-018-2405-7Li, Y., Zhao, K., Chu, X., & Liu, J. (2013). Speeding up k-Means algorithm by GPUs. Journal of Computer and System Sciences, 79(2), 216-229. doi:10.1016/j.jcss.2012.05.004Cuomo, S., De Angelis, V., Farina, G., Marcellino, L., & Toraldo, G. (2019). A GPU-accelerated parallel K-means algorithm. Computers & Electrical Engineering, 75, 262-274. doi:10.1016/j.compeleceng.2017.12.002Al-Ayyoub, M., Abu-Dalo, A. M., Jararweh, Y., Jarrah, M., & Sa’d, M. A. (2015). A GPU-based implementations of the fuzzy C-means algorithms for medical image segmentation. The Journal of Supercomputing, 71(8), 3149-3162. doi:10.1007/s11227-015-1431-yAit Ali, N., Cherradi, B., El Abbassi, A., Bouattane, O., & Youssfi, M. (2018). GPU fuzzy c-means algorithm implementations: performance analysis on medical image segmentation. Multimedia Tools and Applications, 77(16), 21221-21243. doi:10.1007/s11042-017-5589-6Timón, I., Soto, J., Pérez-Sánchez, H., & Cecilia, J. M. (2016). Parallel implementation of fuzzy minimals clustering algorithm. Expert Systems with Applications, 48, 35-41. doi:10.1016/j.eswa.2015.11.011Cebrian, J. M., Imbernón, B., Soto, J., García, J. M., & Cecilia, J. M. (2020). High-throughput fuzzy clustering on heterogeneous architectures. Future Generation Computer Systems, 106, 401-411. doi:10.1016/j.future.2020.01.022Cecilia, J. M., Timon, I., Soto, J., Santa, J., Pereniguez, F., & Munoz, A. (2018). High-Throughput Infrastructure for Advanced ITS Services: A Case Study on Air Pollution Monitoring. IEEE Transactions on Intelligent Transportation Systems, 19(7), 2246-2257. doi:10.1109/tits.2018.2816741Sriramakrishnan, P., Kalaiselvi, T., & Rajeswaran, R. (2019). Modified local ternary patterns technique for brain tumour segmentation and volume estimation from MRI multi-sequence scans with GPU CUDA machine. Biocybernetics and Biomedical Engineering, 39(2), 470-487. doi:10.1016/j.bbe.2019.02.002Fang, Y., Chen, Q., & Xiong, N. (2019). A multi-factor monitoring fault tolerance model based on a GPU cluster for big data processing. Information Sciences, 496, 300-316. doi:10.1016/j.ins.2018.04.053Rodriguez, M. Z., Comin, C. H., Casanova, D., Bruno, O. M., Amancio, D. R., Costa, L. da F., & Rodrigues, F. A. (2019). Clustering algorithms: A comparative approach. PLOS ONE, 14(1), e0210236. doi:10.1371/journal.pone.0210236Pandove, D., Goel, S., & Rani, R. (2018). Systematic Review of Clustering High-Dimensional and Large Datasets. ACM Transactions on Knowledge Discovery from Data, 12(2), 1-68. doi:10.1145/3132088Bezdek, J. C., Ehrlich, R., & Full, W. (1984). FCM: The fuzzy c-means clustering algorithm. Computers & Geosciences, 10(2-3), 191-203. doi:10.1016/0098-3004(84)90020-7Soto, J., Flores-Sintas, A., & Palarea-Albaladejo, J. (2008). Improving probabilities in a fuzzy clustering partition. Fuzzy Sets and Systems, 159(4), 406-421. doi:10.1016/j.fss.2007.08.016Kolen, J. F., & Hutcheson, T. (2002). Reducing the time complexity of the fuzzy c-means algorithm. IEEE Transactions on Fuzzy Systems, 10(2), 263-267. doi:10.1109/91.99512

    METADOCK 2: a high-throughput parallel metaheuristic scheme for molecular docking

    Full text link
    [EN] Motivation Molecular docking methods are extensively used to predict the interaction between protein-ligand systems in terms of structure and binding affinity, through the optimization of a physics-based scoring function. However, the computational requirements of these simulations grow exponentially with: (i) the global optimization procedure, (ii) the number and degrees of freedom of molecular conformations generated and (iii) the mathematical complexity of the scoring function. Results In this work, we introduce a novel molecular docking method named METADOCK 2, which incorporates several novel features, such as (i) a ligand-dependent blind docking approach that exhaustively scans the whole protein surface to detect novel allosteric sites, (ii) an optimization method to enable the use of a wide branch of metaheuristics and (iii) a heterogeneous implementation based on multicore CPUs and multiple graphics processing units. Two representative scoring functions implemented in METADOCK 2 are extensively evaluated in terms of computational performance and accuracy using several benchmarks (such as the well-known DUD) against AutoDock 4.2 and AutoDock Vina. Results place METADOCK 2 as an efficient and accurate docking methodology able to deal with complex systems where computational demands are staggering and which outperforms both AutoDock Vina and AutoDock 4.This work was partially supported by the Fundación Séneca del Centro de Coordinación de la Investigación de la Región de Murcia [Projects 20813/PI/ 18, 20988/PI/18, 20524/PDC/18] and by the Spanish Ministry of Science, Innovation and Universities [TIN2016-78799-P (AEI/FEDER, UE), CTQ2017-87974-R]. The authors thankfully acknowledge the computer resources at CTE-POWER and the technical support provided by Barcelona Supercomputing Center - Centro Nacional de Supercomputación [RES-BCV2018-3-0008].Imbernón, B.; Serrano, A.; Bueno-Crespo, A.; Abellán, JL.; Pérez-Sánchez, H.; Cecilia-Canales, JM. (2020). METADOCK 2: a high-throughput parallel metaheuristic scheme for molecular docking. Bioinformatics. 1-6. https://doi.org/10.1093/bioinformatics/btz958S16Bianchi, L., Dorigo, M., Gambardella, L. M., & Gutjahr, W. J. (2008). A survey on metaheuristics for stochastic combinatorial optimization. Natural Computing, 8(2), 239-287. doi:10.1007/s11047-008-9098-4Cecilia, J. M., Llanes, A., Abellán, J. L., Gómez-Luna, J., Chang, L.-W., & Hwu, W.-M. W. (2018). High-throughput Ant Colony Optimization on graphics processing units. Journal of Parallel and Distributed Computing, 113, 261-274. doi:10.1016/j.jpdc.2017.12.002Desiraju, G., & Steiner, T. (2001). The Weak Hydrogen Bond. doi:10.1093/acprof:oso/9780198509707.001.0001Eisenberg, D., & McLachlan, A. D. (1986). Solvation energy in protein folding and binding. Nature, 319(6050), 199-203. doi:10.1038/319199a0Ewing, T. J. A., Makino, S., Skillman, A. G., & Kuntz, I. D. (2001). Journal of Computer-Aided Molecular Design, 15(5), 411-428. doi:10.1023/a:1011115820450Friesner, R. A., Banks, J. L., Murphy, R. B., Halgren, T. A., Klicic, J. J., Mainz, D. T., … Shenkin, P. S. (2004). Glide:  A New Approach for Rapid, Accurate Docking and Scoring. 1. Method and Assessment of Docking Accuracy. Journal of Medicinal Chemistry, 47(7), 1739-1749. doi:10.1021/jm0306430Guerrero, G. D., Imbernón, B., Pérez-Sánchez, H., Sanz, F., García, J. M., & Cecilia, J. M. (2014). A Performance/Cost Evaluation for a GPU-Based Drug Discovery Application on Volunteer Computing. BioMed Research International, 2014, 1-8. doi:10.1155/2014/474219Hauser, A. S., & Windshügel, B. (2016). LEADS-PEP: A Benchmark Data Set for Assessment of Peptide Docking Performance. Journal of Chemical Information and Modeling, 56(1), 188-200. doi:10.1021/acs.jcim.5b00234Llanes, A., Muñoz, A., Bueno-Crespo, A., García-Valverde, T., Sánchez, A., Arcas-Túnez, F., … M. Cecilia, J. (2016). Soft Computing Techniques for the Protein Folding Problem on High Performance Computing Architectures. Current Drug Targets, 17(14), 1626-1648. doi:10.2174/1389450117666160201114028McIntosh-Smith, S., Price, J., Sessions, R. B., & Ibarra, A. A. (2014). High performance in silico virtual drug screening on many-core processors. The International Journal of High Performance Computing Applications, 29(2), 119-134. doi:10.1177/1094342014528252Mehler, E. L., & Solmajer, T. (1991). Electrostatic effects in proteins: comparison of dielectric and charge models. «Protein Engineering, Design and Selection», 4(8), 903-910. doi:10.1093/protein/4.8.903Morris, G. M., Goodsell, D. S., Halliday, R. S., Huey, R., Hart, W. E., Belew, R. K., & Olson, A. J. (1998). Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. Journal of Computational Chemistry, 19(14), 1639-1662. doi:10.1002/(sici)1096-987x(19981115)19:143.0.co;2-bMysinger, M. M., Carchia, M., Irwin, J. J., & Shoichet, B. K. (2012). Directory of Useful Decoys, Enhanced (DUD-E): Better Ligands and Decoys for Better Benchmarking. Journal of Medicinal Chemistry, 55(14), 6582-6594. doi:10.1021/jm300687eO’Boyle, N. M., Banck, M., James, C. A., Morley, C., Vandermeersch, T., & Hutchison, G. R. (2011). Open Babel: An open chemical toolbox. Journal of Cheminformatics, 3(1). doi:10.1186/1758-2946-3-33Sakurai, Y., Kolokoltsov, A. A., Chen, C.-C., Tidwell, M. W., Bauta, W. E., Klugbauer, N., … Davey, R. A. (2015). Two-pore channels control Ebola virus host cell entry and are drug targets for disease treatment. Science, 347(6225), 995-998. doi:10.1126/science.1258758Sánchez-Linares, I., Pérez-Sánchez, H., Cecilia, J. M., & García, J. M. (2012). High-Throughput parallel blind Virtual Screening using BINDSURF. BMC Bioinformatics, 13(S14). doi:10.1186/1471-2105-13-s14-s13Sliwoski, G., Kothiwale, S., Meiler, J., & Lowe, E. W. (2013). Computational Methods in Drug Discovery. Pharmacological Reviews, 66(1), 334-395. doi:10.1124/pr.112.007336Sörensen, K. (2013). Metaheuristics-the metaphor exposed. International Transactions in Operational Research, 22(1), 3-18. doi:10.1111/itor.12001Yuan, S., Chan, J. F.-W., den-Haan, H., Chik, K. K.-H., Zhang, A. J., Chan, C. C.-S., … Yuen, K.-Y. (2017). Structure-based discovery of clinically approved drugs as Zika virus NS2B-NS3 protease inhibitors that potently inhibit Zika virus infection in vitro and in vivo. Antiviral Research, 145, 33-43. doi:10.1016/j.antiviral.2017.07.00

    A Performance/Cost Evaluation for a GPU-Based Drug Discovery Application on Volunteer Computing

    Get PDF
    Bioinformatics is an interdisciplinary research field that develops tools for the analysis of large biological databases, and, thus, the use of high performance computing (HPC) platforms is mandatory for the generation of useful biological knowledge. The latest generation of graphics processing units (GPUs) has democratized the use of HPC as they push desktop computers to cluster-level performance. Many applications within this field have been developed to leverage these powerful and low-cost architectures. However, these applications still need to scale to larger GPU-based systems to enable remarkable advances in the fields of healthcare, drug discovery, genome research, etc. The inclusion of GPUs in HPC systems exacerbates power and temperature issues, increasing the total cost of ownership (TCO). This paper explores the benefits of volunteer computing to scale bioinformatics applications as an alternative to own large GPU-based local infrastructures. We use as a benchmark a GPU-based drug discovery application called BINDSURF that their computational requirements go beyond a single desktop machine. Volunteer computing is presented as a cheap and valid HPC system for those bioinformatics applications that need to process huge amounts of data and where the response time is not a critical factor.Ingeniería, Industria y Construcció

    Evaluation of the 3-D finite difference implementation of the acoustic diffusion equation model on massively parallel architectures

    Get PDF
    The diffusion equation model is a popular tool in room acoustics modeling. The 3-D Finite Difference (3D-FD) implementation predicts the energy decay function and the sound pressure level in closed environments. This simulation is computationally expensive, as it depends on the resolution used to model the room. With such high computational requirements, a high-level programming language (e.g., Matlab) cannot deal with real life scenario simulations. Thus, it becomes mandatory to use our computational resources more efficiently. Manycore architectures, such as NVIDIA GPUs or Intel Xeon Phi offer new opportunities to enhance scientific computations, increasing the performance per watt, but shifting to a different programming model. This paper shows the roadmap to use massively parallel architectures in a 3D-FD simulation. We evaluate the latest generation of NVIDIA and Intel architectures. Our experimental results reveal that NVIDIA architectures outperform by a wide margin the Intel Xeon Phi co-processor while dissipating approximately 50 W less (25%) for large-scale input problems.Ingeniería, Industria y Construcció

    Maximizing resource usage in multifold molecular dynamics with rCUDA

    Get PDF
    [EN] The full-understanding of the dynamics of molecular systems at the atomic scale is of great relevance in the fields of chemistry, physics, materials science, and drug discovery just to name a few. Molecular dynamics (MD) is a widely used computer tool for simulating the dynamical behavior of molecules. However, the computational horsepower required by MD simulations is too high to obtain conclusive results in real-world scenarios. This is mainly motivated by two factors: (1) the long execution time required by each MD simulation (usually in the nanoseconds and microseconds scale, and beyond) and (2) the large number of simulations required in drug discovery to study the interactions between a large library of compounds and a given protein target. To deal with the former, graphics processing units (GPUs) have come up into the scene. The latter has been traditionally approached by launching large amounts of simulations in computing clusters that may contain several GPUs on each node. However, GPUs are targeted as a single node that only runs one MD instance at a time, which translates into low GPU occupancy ratios and therefore low throughput. In this work, we propose a strategy to increase the overall throughput of MD simulations by increasing the GPU occupancy through virtualized GPUs. We use the remote CUDA (rCUDA) middleware as a tool to decouple GPUs from CPUs, and thus enabling multi-tenancy of the virtual GPUs. As a working test in the drug discovery field, we studied the binding process of a novel flavonol to DNA with the GROningen MAchine for Chemical Simulations (GROMACS) MD package. Our results show that the use of rCUDA provides with a 1.21x speed-up factor compared to the CUDA counterpart version while requiring a similar power budget.The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was jointly supported by the Fundación Séneca (Agencia Regional de Ciencia y Tecnología, Región de Murcia) under grants (20524/PDC/18, 20813/PI/ 18, and 20988/PI/18) and by the Spanish MEC and Eur-opean Commission FEDER under grants TIN2015-66972-C5-3-R, TIN2016-78799-P, and CTQ2017-87974-R (AEI/FEDER, UE). Researchers from the Universitat Politècnica de València are supported by the Generalitat Valenciana under grant PROMETEO/2017/077.Prades, J.; Imbernon, B.; Reaño González, C.; Peña-García, J.; Cerón-Carrasco, JP.; Silla Jiménez, F.; Pérez-Sánchez, H. (2020). Maximizing resource usage in multifold molecular dynamics with rCUDA. International Journal of High Performance Computing Applications. 34(1):5-19. https://doi.org/10.1177/1094342019857131S519341Abraham, M. J., Murtola, T., Schulz, R., Páll, S., Smith, J. C., Hess, B., & Lindahl, E. (2015). GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX, 1-2, 19-25. doi:10.1016/j.softx.2015.06.001Banegas-Luna, A. J., Imbernón, B., Llanes Castro, A., Pérez-Garrido, A., Cerón-Carrasco, J. P., Gesing, S., … Pérez-Sánchez, H. (2018). Advances in distributed computing with modern drug discovery. Expert Opinion on Drug Discovery, 14(1), 9-22. doi:10.1080/17460441.2019.1552936Case, D. A., Cheatham, T. E., Darden, T., Gohlke, H., Luo, R., Merz, K. M., … Woods, R. J. (2005). The Amber biomolecular simulation programs. Journal of Computational Chemistry, 26(16), 1668-1688. doi:10.1002/jcc.20290Csermely, P., Korcsmáros, T., Kiss, H. J. M., London, G., & Nussinov, R. (2013). Structure and dynamics of molecular networks: A novel paradigm of drug discovery. Pharmacology & Therapeutics, 138(3), 333-408. doi:10.1016/j.pharmthera.2013.01.016Franco, A. A. (2013). Multiscale modelling and numerical simulation of rechargeable lithium ion batteries: concepts, methods and challenges. RSC Advances, 3(32), 13027. doi:10.1039/c3ra23502eFrisch MJ, Trucks GW, Schlegel HB, et al. (2016) Gaussian 16 Revision A.03. Wallingford, CT: Gaussian. Inc.Halder, D., & Purkayastha, P. (2018). A flavonol that acts as a potential DNA minor groove binder as also an efficient G-quadruplex loop binder. Journal of Molecular Liquids, 265, 69-76. doi:10.1016/j.molliq.2018.05.117Hess, B., Kutzner, C., van der Spoel, D., & Lindahl, E. (2008). GROMACS 4:  Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. Journal of Chemical Theory and Computation, 4(3), 435-447. doi:10.1021/ct700301qHornak, V., Abel, R., Okur, A., Strockbine, B., Roitberg, A., & Simmerling, C. (2006). Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins: Structure, Function, and Bioinformatics, 65(3), 712-725. doi:10.1002/prot.21123Imbernón, B., Cecilia, J. M., Pérez-Sánchez, H., & Giménez, D. (2017). METADOCK: A parallel metaheuristic schema for virtual screening methods. The International Journal of High Performance Computing Applications, 32(6), 789-803. doi:10.1177/1094342017697471Iserte, S., Prades, J., Reano, C., & Silla, F. (2016). Increasing the Performance of Data Centers by Combining Remote GPU Virtualization with Slurm. 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). doi:10.1109/ccgrid.2016.26Bentham Science Publisher, B. S. P. (2006). Scoring Functions for Protein-Ligand Docking. Current Protein & Peptide Science, 7(5), 407-420. doi:10.2174/138920306778559395Jorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W., & Klein, M. L. (1983). Comparison of simple potential functions for simulating liquid water. The Journal of Chemical Physics, 79(2), 926-935. doi:10.1063/1.445869Kitchen, D. B., Decornez, H., Furr, J. R., & Bajorath, J. (2004). Docking and scoring in virtual screening for drug discovery: methods and applications. Nature Reviews Drug Discovery, 3(11), 935-949. doi:10.1038/nrd1549Lagarde, N., Zagury, J.-F., & Montes, M. (2015). Benchmarking Data Sets for the Evaluation of Virtual Ligand Screening Methods: Review and Perspectives. Journal of Chemical Information and Modeling, 55(7), 1297-1307. doi:10.1021/acs.jcim.5b00090Noroozi, M., Angerson, W. J., & Lean, M. E. (1998). Effects of flavonoids and vitamin C on oxidative DNA damage to human lymphocytes. The American Journal of Clinical Nutrition, 67(6), 1210-1218. doi:10.1093/ajcn/67.6.1210Patra, M., Hyvönen, M. T., Falck, E., Sabouri-Ghomi, M., Vattulainen, I., & Karttunen, M. (2007). Long-range interactions and parallel scalability in molecular simulations. Computer Physics Communications, 176(1), 14-22. doi:10.1016/j.cpc.2006.07.017Pezeshgi Modarres, H., Dorokhov, B. D., Popov, V. O., Ravin, N. V., Skryabin, K. G., & Dal Peraro, M. (2015). Understanding and Engineering Thermostability in DNA Ligase from Thermococcus sp. 1519. Biochemistry, 54(19), 3076-3085. doi:10.1021/bi501227bPhillips, J. C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., … Schulten, K. (2005). Scalable molecular dynamics with NAMD. Journal of Computational Chemistry, 26(16), 1781-1802. doi:10.1002/jcc.20289Prades, J., Reaño, C., Silla, F., Imbernón, B., Pérez-Sánchez, H., & Cecilia, J. M. (2018). Increasing Molecular Dynamics Simulations Throughput by Virtualizing Remote GPUs with rCUDA. Proceedings of the 47th International Conference on Parallel Processing Companion - ICPP ’18. doi:10.1145/3229710.3229734Pronk, S., Páll, S., Schulz, R., Larsson, P., Bjelkmar, P., Apostolov, R., … Lindahl, E. (2013). GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics, 29(7), 845-854. doi:10.1093/bioinformatics/btt055Reano, C., & Silla, F. (2015). A Performance Comparison of CUDA Remote GPU Virtualization Frameworks. 2015 IEEE International Conference on Cluster Computing. doi:10.1109/cluster.2015.76Reaño, C., Silla, F., Shainer, G., & Schultz, S. (2015). Local and Remote GPUs Perform Similar with EDR 100G InfiniBand. Proceedings of the Industrial Track of the 16th International Middleware Conference on ZZZ - Middleware Industry ’15. doi:10.1145/2830013.2830015Sánchez-Linares, I., Pérez-Sánchez, H., Cecilia, J. M., & García, J. M. (2012). High-Throughput parallel blind Virtual Screening using BINDSURF. BMC Bioinformatics, 13(Suppl 14), S13. doi:10.1186/1471-2105-13-s14-s13Shaw, D. E., Maragakis, P., Lindorff-Larsen, K., Piana, S., Dror, R. O., Eastwood, M. P., … Wriggers, W. (2010). Atomic-Level Characterization of the Structural Dynamics of Proteins. Science, 330(6002), 341-346. doi:10.1126/science.1187409Yoo, A. B., Jette, M. A., & Grondona, M. (2003). SLURM: Simple Linux Utility for Resource Management. Lecture Notes in Computer Science, 44-60. doi:10.1007/10968987_
    corecore